Abstract

A methodology is described whereby a linear ADT may 
be rigorously encapsulated within a state monad. A 
CPS-like translation from the original ADT axioms into 
monadic ones is also described and proven correct, so 
that reasoning can be accomplished at the monadic level 
without exposing the state. The ADT axioms are suitably
constrained by a linear type system to make this
translation possible. This constraint also allows the 
state to be “updated in place,” a notion made precise 
via a graph-rewrite operational semantics. 

1 Introduction 

In recent years, numerous proposals for I/O, destructive
updates to data structures, mutable variables, nondeterminism,
and concurrency have been put forth, all
using monads to structure programs in such a way that
details of the computation are effectively hidden and
encapsulated [18, 20]. One of the most important uses
of monads is in dealing with state, resulting in a style of
programming referred to appropriately by Peyton Jones
and Wadler as imperative functional programming [16].
In this style, a state monad is used to encapsulate the
state in an abstract datatype, thus avoiding the “plumbing”
usually associated with pure functional programs
that manipulate state. Access to the hidden state is
achieved by abstract operations that are sequenced via
a small set of monadic combinators, typically called bind
and unit. This style of programming has been adopted
wholeheartedly by the Haskell [8] community, and its
utility has been verified through many non-trivial
applications.

Surprisingly, however, little work has been done in 
establishing a formal connection between monads and 
state. In particular: 

• There are few formal connections between monads
and single-threaded state, where the property
of being single-threaded is precisely what is needed
to guarantee safe, “in-place” update. The severity
of this problem is easily demonstrated by an
example: For any state-transformer monad with
type M a = S -> (S, a), there is a trivial operation
getState :: M S that will instantly destroy
any hope for single-threadedness:

getState s = (s, s) 

• Nor are there many formal connections between
monads and the state-manipulating operations themselves.
That is, the standard monad operations tell
us how the state-manipulating operations may be 
sequenced, but say nothing about what they do, 
or how they are derived. 

• Finally, although laws for reasoning about several 
specific monads exist [18, 15], there is no general 
way to derive these laws. 
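The first point above can be made concrete with a small Haskell sketch. The state type S (a list of integers) and the modifier setHead are assumptions chosen only for illustration; getState is the operation from the text.

```haskell
-- State-transformer monad in the shape used in the text: M a = S -> (S, a).
-- S is an assumed stand-in for some updatable structure.
type S = [Int]
type M a = S -> (S, a)

-- The "trivial operation" from the text: it hands the state back as a result,
-- so the same state is now reachable along two paths.
getState :: M S
getState s = (s, s)

-- Hypothetical modifier: overwrite the head of the list.
setHead :: Int -> M ()
setHead v (_ : xs) = (v : xs, ())
setHead _ []       = ([], ())

-- Leak the state, update it, then inspect both the leaked copy and the new state.
leakThenUpdate :: S -> (S, S)
leakThenUpdate s0 =
  let (s1, old) = getState s0      -- old aliases the state
      (s2, _)   = setHead 99 s1    -- a pure update builds a new list
  in (old, s2)
```

With the pure update, leakThenUpdate [1,2,3] evaluates to ([1,2,3],[99,2,3]). If setHead overwrote the list cell destructively, the leaked copy would change as well, which is exactly the loss of single-threadedness described above.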

To solve these problems, we outline a methodology
that allows one to begin with a conventional axiomatization
of a type of interest, rigorously encapsulate it
within a monad, and reason about it abstractly in a
monadic fashion. This is only possible if the axiomatization
satisfies a certain linearity condition, established
using linear types [6, 9, 4, 19, 5]. This same condition 
allows us to prove that the type of interest (i.e. the 
state) may be updated in place. 

Our combination of linear types and monads unveils
a connection between two very different methods for
dealing with state in a functional language. It is interesting
that a CPS-like conversion, with strong ties to
notions of sequentiality, formalizes this connection.

From a pragmatic perspective, the methodology allows
one to use conventional ADT techniques to design
a type of interest and operations on it. The use
of the linear type system is confined just to the ADT 
axioms. Once the axioms type-check, the linear type 
system never needs to be dealt with again. In other 
words, the monad encapsulates not just the ADT, but 
its linearity as well. 

The methodology is also easily implementable (although
we have not yet done so). The graph-rewrite semantics
is quite similar to conventional graph reduction,
the basis of many functional language implementations.
Moreover, the monadic axioms can be derived automatically,
yielding a source of program transformation rules
for compile-time optimization. These rules may also be
used by semantics-directed compilers and interpreters 
that are based on monads [12]. 

An Example. Consider a simple integer list ADT, 
with operations: 

nil        :: IntList
cons       :: Int -> IntList -> IntList
nth-select :: Int -> IntList -> Int
nth-update :: (Int, Int) -> IntList -> IntList

These operations are axiomatized by: 

nth-select 0 (cons x xs)          => x
nth-select (i + 1) (cons x xs)    => nth-select i xs
nth-update (0, v) (cons x xs)     => cons v xs
nth-update (i + 1, v) (cons x xs) => cons x (nth-update (i, v) xs)
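As a rough illustration, the axioms above can be transcribed into Haskell as follows. The constructor names Nil and Cons, the camel-case function names, and the error clauses for inputs not covered by any axiom are assumptions of this sketch, not part of the ADT.

```haskell
-- The type of interest; Nil and Cons stand for nil and cons.
data IntList = Nil | Cons Int IntList
  deriving (Eq, Show)

-- nth-select 0 (cons x xs)       => x
-- nth-select (i + 1) (cons x xs) => nth-select i xs
nthSelect :: Int -> IntList -> Int
nthSelect 0 (Cons x _) = x
nthSelect i (Cons _ xs) | i > 0 = nthSelect (i - 1) xs
nthSelect _ _ = error "nthSelect: no axiom applies"

-- nth-update (0, v) (cons x xs)     => cons v xs
-- nth-update (i + 1, v) (cons x xs) => cons x (nth-update (i, v) xs)
nthUpdate :: (Int, Int) -> IntList -> IntList
nthUpdate (0, v) (Cons _ xs) = Cons v xs
nthUpdate (i, v) (Cons x xs) | i > 0 = Cons x (nthUpdate (i - 1, v) xs)
nthUpdate _ _ = error "nthUpdate: no axiom applies"
```

Note that this pure nthUpdate rebuilds every Cons cell in front of the updated position; the original list is left intact.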

This axiomatization is well typed with respect to the
linear type system described in Section 3. Consequently,
a mutable version of this ADT can be derived automatically,
with operations encapsulated in a monad having
types:

nilM        :: M Int -> Int
consM       :: Int -> M ()
nth-selectM :: Int -> M Int
nth-updateM :: (Int, Int) -> M ()

The following small program which manipulates this 
mutable ADT: 

nilM
  (consM 2 >>
   consM 1 >>
   nth-updateM (1,5) >>
   nth-selectM 1)


returns the value 5. Furthermore, using the graph-rewrite
semantics of Section 5, nth-updateM will update
the list in-place (in contrast with the original ADT,
which must recreate the first n elements). In this example
the benefits are only constant-factor amortized improvements
in time and space, but in other examples the
improvements are more dramatic (see Appendix A for
the design of an array ADT, where a linear-factor improvement
is realized). Even constant-factor improvements
may be important in certain situations, for example
if the improvement can reduce load on the garbage
collector (we will show later that nth-updateM induces
only a constant number of heap or stack allocations).

Finally, a new set of axioms for the ADT can be automatically
obtained by applying the translation scheme
in Section 4.3 to the original axioms above, yielding:

consM x >> nth-selectM 0 >>= λz -> m
    => consM x >> m[x/z]
consM x >> nth-selectM (i + 1) >>= λz -> m
    => nth-selectM i >>= λz -> consM x >> m
consM x >> nth-updateM (0, v)
    => consM v
consM x >> nth-updateM (i + 1, v)
    => nth-updateM (i, v) >> consM x
consM x >> unit y => unit y
nilM (unit y) => y

(>>= and >> are the monadic bind and seq operators,
respectively.) Note that these axioms allow reasoning
about integer lists at the level of monads, without exposing
the underlying representation.

2 Preliminaries 

Define the language L as the lambda calculus extended
with constants, which include primitive datatypes and
operations on them, such as the unit type (), integers,
booleans, and pairs. Its syntax is given in Figure 1.
In addition, for convenience we will often use a non-recursive
let expression defined by:

let x = e1 in e2 ≡ (λx -> e2) e1

The only reduction rules that interest us at this stage
are the conventional β rule:

(λx -> e) y =>β e[y/x]

and the δ rules that govern the behavior of conditional
expressions, primitive datatypes, and recursion. For the
conditional and pairing, we have:

if True e1 e2  =>δ e1
if False e1 e2 =>δ e2
fst (e1, e2)   =>δ e1
snd (e1, e2)   =>δ e2

x ∈ Var (variables)
k ∈ Con (primitive constants)

e ::= x
    | k
    | e1 e2
    | (e1, e2)
    | λx -> e

Figure 1: Syntax of L


Note that the above rules imply a non-strict semantics.
For recursion, we assume fix ∈ Con with δ rule:

fix f =>δ f (fix f)

Define the relation => as the union of =>β and =>δ.
We write e1 ⇓CBN e2 to denote the reduction of e1 to
e2 using => under a call-by-name (leftmost, outermost)
reduction strategy. Evaluation of an expression e is defined
as reduction of e to a value v, defined as:¹

v ::= k | (v1, v2)

Evaluation is a partial function.

2.1 Abstract Datatypes (ADTs) 

Definition 1 Given a type of interest T, a simple ADT
is one in which each operation can be classified as either
a generator of type X1 -> T, a modifier of type X2 ->
T -> T, or a selector of type X3 -> T -> X4, where the
Xi are arbitrary auxiliary datatypes.

This is a standard classification for ADTs, and most
conventional ADTs such as arrays, lists, stacks, and
queues can be expressed in this way. However, it does
not include tree-shaped ADT’s, since a modifier for such
an ADT would need a type such as X -> T -> T -> T.
This limitation is discussed further in Section 6.

We allow L to be extended with simple ADTs by
extending the base types as necessary and using the
ADT axioms as δ rules in the calculus. We assume that
adding new axioms does not destroy confluence (which
can also be achieved by choosing a determinate order
on the delta rules, as is done in most modern functional
languages). For an ADT A we refer to the extended
language as L_A.

Definition 2 Given a simple ADT with a set of generators
G, modifiers M, and selectors S, an axiomatization
distinguishes a set D = S ∪ U, where U ⊆ M. Each
operator d ∈ D has one or more defining axioms whose
syntax is given in Figure 2.

1 Evaluation can be generalized to include abstractions as values, 


This definition subdivides modifiers into two categories:
U is the set of mutators, each having defining axioms;
and the rest (M \ U) are called constructors, whose behavior
is defined implicitly by the axioms. For example,
for the integer list ADT in the introduction, nil is a generator,
cons is a constructor, nth-update is a mutator,
and nth-select is a selector.

Note that the syntax in Figure 2 restricts the axioms 
to be first-order. 

3 The Linearity Condition 

Not every ADT can be converted to monadic form; it 
must be “linear,” or “single-threaded,” in some sense. 
In this section we describe a linear type system which 
constrains the ADT sufficiently to allow the conversion. 
It is inspired by the linear type system with read-only
access in [19], which in turn was influenced by the single-threadedness
condition in [17].

A type in our system can be a basic type, a pair of
auxiliary types, a function type, or the type of interest
T. In addition, it could be a linear type of interest,
denoted ¡T. Formally:²

u, v ::= Int | Bool | ...
       | (u, v) | (u -> v)
       | T | ¡T

Equipped with this finer-grained type system, we
are able to type each class of ADT operators to better
reflect the intended use of T:

g :: X1 -> ¡T
m :: X2 -> ¡T -> ¡T
s :: X3 -> T -> X4

Generators and modifiers manipulate ¡T to ensure that
at any time they either create or have the sole pointer 
to the type of interest. On the other hand, selectors 
take a non-linear T as an argument because they only 
read the type of interest and, as a result, the linearity 
condition can be relaxed. 

2 In this work, the linearity of the type of interest is all that we
care about, so it would be a waste of ink to follow the convention
of linear logic and tag each nonlinear type with !. Instead we tag the
linear type of interest with a ¡ as is done in [19]. We pronounce ¡T as
“squirt T” (gentler than a bang).

axiom ::= lhs => rhs
lhs   ::= m x pat | s x pat
pat   ::= t | g x | c x pat
rhs   ::= g | s | m | c | x | t | k
        | (rhs1, rhs2) | rhs1 rhs2
        | if rhs1 then rhs2 else rhs3

g ∈ G      (generators)
s ∈ S      (selectors)
m ∈ U      (mutators)
c ∈ M \ U  (constructors)
x ∈ Var    (identifiers of auxiliary type)
t ∈ Var    (identifiers of type of interest)
k ∈ Con    (constants)

Figure 2: Axiom Syntax Scheme


An assumption list associates variables and constants
with types:

A ::= x1 : u1, ..., xn : un

where x1, ..., xn are distinct variables or constants. The
base assumption list B_A for an ADT A associates all
the operators of A, primitives, and constants with their
types. We write A, B to denote the concatenation of two
assumption lists; an implicit side condition of a concatenation
is that A and B contain distinct variables.

R(A) is an assumption list identical to A except that
all the associations of mutators are elided, and those of
generators, constructors, and variables of type ¡T are
“non-linearized.” That is, the ¡’s in their type signatures
are removed. Intuitively, R prepares a “read-only”
assumption list for an expression which may do multiple
reads on the type of interest.

The typing rules are listed in Figure 3. The predicate
NL(u) (“non-linear u”) checks that u does not have a
linear type of interest as a component. It is defined by:

NL(u) <=> u ≠ ¡T and
          u = (v, w) => NL(v) and NL(w)

The predicate HY(u) (“hygienic u”) checks that u is not
a function type and does not have any type of interest,
either linear or nonlinear, as a component. It is defined
by:

HY(u) <=> u ≠ v -> w and
          u ≠ ¡T and
          u ≠ T and
          u = (v, w) => HY(v) and HY(w)
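The two predicates can be sketched directly over a small Haskell representation of the type grammar. The datatype Ty and its constructor names are assumptions of this sketch (TUnit in particular stands in for any further basic types); only the two defining equations come from the text.

```haskell
-- An assumed encoding of the type grammar u, v.
data Ty
  = TInt
  | TBool
  | TUnit          -- stand-in for other basic types (assumption)
  | TPair Ty Ty    -- (u, v)
  | TFun Ty Ty     -- u -> v
  | TI             -- T, the non-linear type of interest
  | LinTI          -- the linear type of interest
  deriving (Eq, Show)

-- NL(u): u is not the linear type of interest, and if u is a pair
-- then both components are NL.
nl :: Ty -> Bool
nl LinTI       = False
nl (TPair v w) = nl v && nl w
nl _           = True

-- HY(u): u is not a function type, not a type of interest (linear or not),
-- and if u is a pair then both components are HY.
hy :: Ty -> Bool
hy (TFun _ _)  = False
hy LinTI       = False
hy TI          = False
hy (TPair v w) = hy v && hy w
hy _           = True
```

For example, hy (TPair TInt TBool) is True, while hy (TPair TInt TI) and nl (TPair TInt LinTI) are both False.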

Note that our system allows (Weakening) on any
variable because the “no discarding” property of a linear
type of interest is not our concern. This design
in turn enables the use of only one (Identity) rule. On
the other hand, (Contraction) can only be applied to a
variable of non-linear type because the “no duplicate”
property of a linear type of interest is the theme of
the whole type system. There are three application
rules: (¡T-Application) is identical to the application
rules in other linear type systems except that here we
constrain the resulting type of the application to be
¡T; (T-Application) allows read-only access in the argument
term whose type is T, and the hygienic condition
ensures that no type of interest can be smuggled
out (a side-effect of this constraint is that the resulting
type cannot be a function type); (Application) handles
applications whose operands are of hygienic type.
Also note that read-only access is used when typing the
predicate of a conditional expression.

Definition 3 An ADT A is linear if each axiom lhs => rhs
has the following properties:

• There exists a type u and assumption list A such
that:

- B_A, A ⊢ lhs : u, and
- If u = ¡T, then B_A, A ⊢ rhs : u;
  otherwise, R(B_A, A) ⊢ rhs : u.

• Each free variable has exactly one occurrence in
lhs.

As an example of a well-typed linear ADT, consider
the integer list ADT given in the introduction. As an
example of an ADT that would not be well-typed, consider
adding the following (contrived) operator to the
integer list ADT:

silly l => let l' = nth-update (1,2) l 
in nth-select 1 l' 

This is not linear because silly is a selector, but there 
is an update to the type of interest on the right-hand 
side, which our type system does not allow. 

4 Monadic ADTs 

To say that an ADT is linear is to say something about 
its axioms. But if we naively implement a linear ADT 
in a functional language based on a Hindley-Milner type 
system, the linear axioms won’t buy us much: we still 
will not be able to do in-place updates on the type of 





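(Identity)          x : u ⊢ x : u

                    A, y : u, z : u, B ⊢ e : v    NL(u)
(Contraction)       -------------------------------------
                    A, x : u, B ⊢ e[x/y, x/z] : v

                    A ⊢ e : v
(Weakening)         ------------------
                    A, x : u ⊢ e : v

                    A, x : u, y : v, B ⊢ e : t
(Exchange)          ----------------------------
                    A, y : v, x : u, B ⊢ e : t

                    R(A) ⊢ p : Bool    A ⊢ e1 : u    A ⊢ e2 : u
(If)                ---------------------------------------------
                    A ⊢ if p then e1 else e2 : u

                    R(A) ⊢ e1 : u    R(A) ⊢ e2 : v    HY(u)    HY(v)
(Pair)              --------------------------------------------------
                    A ⊢ (e1, e2) : (u, v)

                    A ⊢ e1 : ¡T -> ¡T    B ⊢ e : ¡T
(¡T-Application)    --------------------------------
                    A, B ⊢ e1 e : ¡T

                    A ⊢ e1 : T -> v    R(B) ⊢ e2 : T    HY(v)
(T-Application)     -------------------------------------------
                    A, B ⊢ e1 e2 : v

                    A ⊢ e1 : u -> v    B ⊢ e2 : u    HY(u)
(Application)       ----------------------------------------
                    A, B ⊢ e1 e2 : v

Figure 3: Linear Typing Rules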

interest because the type system is not strong enough 
to guarantee that the argument to an ADT mutator is 
unshared. For example, the program: 

let l1 = cons 1 (cons 2 nil)
    l2 = nth-update (1,5) l1
in nth-select 1 l2 + nth-select 1 l1

should return 7. But if the update were done destructively,
the result might be 10, which is clearly wrong.
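The difference between the two outcomes can be reproduced in Haskell. This is only a sketch: ordinary lists stand in for the integer list ADT, updateAt is a hypothetical helper, and IORef cells from Data.IORef are used to simulate what an unchecked destructive update would do to a shared list.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)

-- Pure update on an ordinary list (hypothetical helper).
updateAt :: Int -> a -> [a] -> [a]
updateAt i v xs = take i xs ++ [v] ++ drop (i + 1) xs

-- Pure version of the program in the text: l1 is unaffected by the update.
pureResult :: Int
pureResult =
  let l1 = [1, 2]
      l2 = updateAt 1 5 l1
  in (l2 !! 1) + (l1 !! 1)

-- Simulated destructive version: l2 is the same cells as l1, overwritten.
destructiveResult :: IO Int
destructiveResult = do
  l1 <- mapM newIORef [1, 2 :: Int]
  writeIORef (l1 !! 1) 5          -- in-place "nth-update (1,5)"
  let l2 = l1                     -- the updated list shares l1's cells
  a <- readIORef (l2 !! 1)
  b <- readIORef (l1 !! 1)
  return (a + b)
```

Here pureResult is 7, whereas destructiveResult yields 10, because the second read of l1 observes the overwritten cell.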

We could, of course, enrich the type system with
linear types throughout the language, in which case the
above program would be ill-typed (at least one functional
language, Clean, uses this approach [2]). We
choose instead a different solution: we will encapsulate
the linear ADT in a monad, and use the monadic structure
to enforce linearity of the type of interest throughout
the language.

4.1 Monad Basics 

Definition 4 A monad is a triple (M, >>=, unit) where
M is a type constructor, and >>= and unit are functions
which satisfy the following three laws:

unit a >>= λb -> n = n[a/b]
n >>= unit = n
n1 >>= (λa -> n2 >>= λb -> n3) = (n1 >>= λa -> n2) >>= λb -> n3

In the last law, a does not appear free in n3.


Operationally speaking, M a is a computation which
produces an answer of type a when executed; >>= combines
two computations into a larger one which performs
the two in sequence; and unit injects a single value
into a computation. Of particular interest to us is the
state monad which abstracts over a type of interest to
be treated as mutable state:

type M a     = T -> (a, T)
(>>=)        :: M a -> (a -> M b) -> M b
unit         :: a -> M a
(n >>= k) t  => let p = n t in k (fst p) (snd p)
(unit a) t   => (a, t)


It can be easily verified that these definitions satisfy the 
three monad laws. 
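As a hedged illustration, the state monad can be written in Haskell and the three laws checked at a few sample states. The names bind and seqM (for the bind and seq operators), the choice T = [Int], and the sample computations push and size are all assumptions of this sketch; testing at sample points is of course not a proof.

```haskell
-- T is the type of interest; a list of Int is an assumed example.
type T = [Int]
type M a = T -> (a, T)

unit :: a -> M a
unit a t = (a, t)

-- bind plays the role of the monadic bind operator.
bind :: M a -> (a -> M b) -> M b
bind n k t = let p = n t in k (fst p) (snd p)

-- seqM plays the role of the specialized sequencing operator.
seqM :: M a -> M b -> M b
seqM n1 n2 t = n2 (snd (n1 t))

-- Hypothetical sample computations.
push :: Int -> M ()
push x t = ((), x : t)

size :: M Int
size t = (length t, t)

-- The three monad laws, compared pointwise at a given state.
leftUnit, rightUnit, assoc :: T -> Bool
leftUnit t  = (unit 3 `bind` push) t == push 3 t
rightUnit t = (size `bind` unit) t == size t
assoc t =
  (size `bind` (\a -> push a `bind` (\_ -> size))) t
    == ((size `bind` (\a -> push a)) `bind` (\_ -> size)) t
```

Each of leftUnit, rightUnit, and assoc returns True for states such as [] and [1,2,3].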

For convenience, we specialize a version of >>= which
we write >>:

(n1 >> n2) t => n2 (snd (n1 t))

These correspond to Haskell’s >>= and >>, respectively,
and are sometimes called bind and seq.
4.2 Monadic Encapsulation of ADT’s

For each modifier m :: X2 -> T -> T and selector s ::
X3 -> T -> X4 in an ADT A, we introduce monadic
counterparts mM and sM, respectively, defined in terms
of m and s:

mM :: X2 -> M ()
sM :: X3 -> M X4

mM x t => ((), m x t)
sM x t => (s x t, t)

The typing M () which results from applying mM implies
that the modifier only affects the type of interest
and does not return anything interesting.

In addition, for each generator g :: X1 -> T in an
ADT A we define a monadic generator gM by:

gM :: X1 -> M a -> a

gM x m => fst (m (g x))

This definition reflects the use of gM as the operator
which invokes a computation: it applies the monadic
computation to the type of interest generated by the
original ADT generator and returns the first element
of the resulting pair of the computation while discarding
the second element, the (possibly altered) type of
interest.
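A minimal Haskell sketch of this encapsulation for the integer list ADT is given below. The Haskell names (Nil, Cons, nthSelect, consM, seqM, and so on) are assumptions of the sketch, and the pure functions underneath perform no in-place update; the point is only to show the shape of mM, sM, and gM.

```haskell
data IntList = Nil | Cons Int IntList
  deriving (Eq, Show)

nthSelect :: Int -> IntList -> Int
nthSelect 0 (Cons x _) = x
nthSelect i (Cons _ xs) | i > 0 = nthSelect (i - 1) xs
nthSelect _ _ = error "nthSelect: no axiom applies"

nthUpdate :: (Int, Int) -> IntList -> IntList
nthUpdate (0, v) (Cons _ xs) = Cons v xs
nthUpdate (i, v) (Cons x xs) | i > 0 = Cons x (nthUpdate (i - 1, v) xs)
nthUpdate _ _ = error "nthUpdate: no axiom applies"

-- State monad over the type of interest.
type M a = IntList -> (a, IntList)

seqM :: M a -> M b -> M b
seqM n1 n2 t = n2 (snd (n1 t))

-- mM x t => ((), m x t)
consM :: Int -> M ()
consM x t = ((), Cons x t)

nthUpdateM :: (Int, Int) -> M ()
nthUpdateM iv t = ((), nthUpdate iv t)

-- sM x t => (s x t, t)
nthSelectM :: Int -> M Int
nthSelectM i t = (nthSelect i t, t)

-- gM x m => fst (m (g x)); nil takes no auxiliary argument.
nilM :: M a -> a
nilM m = fst (m Nil)

-- The small program from the introduction.
example :: Int
example =
  nilM (consM 2 `seqM` consM 1 `seqM` nthUpdateM (1, 5) `seqM` nthSelectM 1)
```

Evaluating example gives 5, matching the result stated for the introductory program.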

4.3 Deriving Monadic Axioms 

The encapsulation just described leads to a monadic
implementation of a linear ADT, and we will see in a
later section that it allows in-place update of the type
of interest. But it is not an especially good mechanism
for reasoning about a monadic ADT, since it reveals the
details of the encapsulation, and, once the original ADT
is exposed, reverts to reasoning in terms of that ADT. It
would be better if we had monadic axioms, corresponding
to the original linear ones, that did not expose the
underlying ADT representation. In this section we describe
a method for deriving such axioms from those in
the original linear ADT.

Our translation scheme resembles, perhaps not surprisingly,
a CPS conversion, and is related to call-by-value
monad transformation [13, 18]. But it is also different
from these in several ways, the two most important
being: (1) the conversion is performed on axioms
rather than just terms, and (2) the conversion is done
with respect to a particular value (the type of interest)
and in fact is not even valid for non-linear ADTs.

In the following, we use ⌈...⌉ and ⟨...⟩ to quote and
un-quote, respectively, the text being translated. First
we define a translation function D for expressions that,
given an expression e and a continuation k, returns an
equivalent expression using the monadic ADT:

D ⌈s ⟨x⟩ ⟨t⟩⌉ k = D t (D x ⌈λy -> sM y >>= ⟨k⟩⌉)

D ⌈m ⟨x⟩ ⟨t⟩⌉ k = D t (D x ⌈λy -> mM y >> ⟨k⟩⌉)

D ⌈g ⟨x⟩⌉ k = ⌈gM ⟨x⟩ ⟨k⟩⌉

D ⌈if ⟨p⟩ then ⟨c⟩ else ⟨a⟩⌉ k
    = D p ⌈λx -> if x then ⟨D c k⟩ else ⟨D a k⟩⌉

D ⌈(⟨e1⟩, ⟨e2⟩)⌉ k
    = D e1 ⌈λy -> ⟨D e2 ⌈λz -> (unit (y, z)) >>= ⟨k⟩⌉⟩⌉

D ⌈f ⟨e1⟩ ⟨e2⟩⌉ k
    = D e1 ⌈λy -> ⟨D e2 ⌈λz -> (unit (f y z)) >>= ⟨k⟩⌉⟩⌉

D ⌈t⌉ k = k

D ⌈x⌉ (λy -> n) = n[x/y]

D ⌈c⌉ (λy -> n) = n[c/y]

Here s, m, and g represent an arbitrary selector, modifier,
and generator, respectively; sM, mM, and gM are
their monadic counterparts; t is an identifier whose type
is the type of interest; x is an identifier having auxiliary
type; f is a primitive function; and c is a constant
having auxiliary type. We assume suitable α-renaming
to avoid name clashes with y, z, and k.

Given D, we can now define translations S and M for
axioms that define selectors and mutators, respectively.

S ⌈s ⟨x1⟩ ⟨t⟩ => ⟨x2⟩⌉
    = ⌈⟨D ⌈s ⟨x1⟩ ⟨t⟩⌉ ⌈λy -> n⌉⟩
        => ⟨D x2 ⌈λy -> ⟨D t ⌈n⌉⟩⌉⟩⌉

M ⌈m ⟨x⟩ ⟨t1⟩ => ⟨t2⟩⌉
    = ⌈⟨D ⌈m ⟨x⟩ ⟨t1⟩⌉ ⌈k⌉⟩ => ⟨D t2 ⌈k⌉⟩⌉

Here also we assume suitable α-renaming to avoid name
clashes with y, z, and k. Note that S translates the
right-hand side in the context of the nested type of interest
on the left-hand side, whereas M translates both
sides independently. Also note that if an axiom is defined
in terms of type of interest identifier t, then the
type of interest disappears during the translation (thus
completing our mission of hiding the state!).

In addition to the axioms derived from the original
ADT axioms, we also need to axiomatize:

• the interactions between generators and unit. For
each generator gM, add the rule:

gM (unit y) => y

• the interactions between constructors and >>. For
each constructor cM, add the rule:

cM x >> unit y => unit y

An example of the overall translation is the monadic
axioms for the integer list ADT given in the introduction.
It is interesting to note in this translation process
that mutators and selectors trade roles! This is an artifact
of the CPS-like nature of the translation, which
uncovers a “duality” between constructors and selectors,
much like the duality noted by Filinski in [3].

Calling the above translation scheme T, its correctness
is captured by its relationship to the monadic encapsulation
previously defined:

Theorem 1 The monadic encapsulation of a linear ADT
A satisfies the monadic axiomatization derived from A
using translation T.

Proof: By applying the translation scheme to a
generic axiom of each kind, then using equational reasoning
to unfold the monadic encapsulation on each side
and reduce them to equivalent terms. A proof for one
of the three axioms is given in Appendix B.
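For the integer list ADT, the claim of Theorem 1 can be spot-checked in Haskell by comparing both sides of the four derived selector and mutator axioms at sample states. This is only a sanity check under assumed names (bind, seqM, and the arbitrary continuation k standing in for "λz -> m"), not a substitute for the proof.

```haskell
data IntList = Nil | Cons Int IntList
  deriving (Eq, Show)

nthSelect :: Int -> IntList -> Int
nthSelect 0 (Cons x _) = x
nthSelect i (Cons _ xs) | i > 0 = nthSelect (i - 1) xs
nthSelect _ _ = error "nthSelect: no axiom applies"

nthUpdate :: (Int, Int) -> IntList -> IntList
nthUpdate (0, v) (Cons _ xs) = Cons v xs
nthUpdate (i, v) (Cons x xs) | i > 0 = Cons x (nthUpdate (i - 1, v) xs)
nthUpdate _ _ = error "nthUpdate: no axiom applies"

type M a = IntList -> (a, IntList)

bind :: M a -> (a -> M b) -> M b
bind n k' t = let p = n t in k' (fst p) (snd p)

seqM :: M a -> M b -> M b
seqM n1 n2 t = n2 (snd (n1 t))

consM :: Int -> M ()
consM x t = ((), Cons x t)

nthUpdateM :: (Int, Int) -> M ()
nthUpdateM iv t = ((), nthUpdate iv t)

nthSelectM :: Int -> M Int
nthSelectM i t = (nthSelect i t, t)

-- An arbitrary continuation (hypothetical) standing in for "λz -> m".
k :: Int -> M Int
k z t = (z + 1, Cons z t)

-- Each axiom: left-hand side and right-hand side compared at state t.
ax1 :: Int -> IntList -> Bool
ax1 x t = (consM x `seqM` (nthSelectM 0 `bind` k)) t == (consM x `seqM` k x) t

ax2 :: Int -> Int -> IntList -> Bool
ax2 x i t = (consM x `seqM` (nthSelectM (i + 1) `bind` k)) t
         == (nthSelectM i `bind` (\z -> consM x `seqM` k z)) t

ax3 :: Int -> Int -> IntList -> Bool
ax3 x v t = (consM x `seqM` nthUpdateM (0, v)) t == consM v t

ax4 :: Int -> Int -> Int -> IntList -> Bool
ax4 x i v t = (consM x `seqM` nthUpdateM (i + 1, v)) t
           == (nthUpdateM (i, v) `seqM` consM x) t

allAxiomsHold :: Bool
allAxiomsHold =
  and [ ax1 x t && ax2 x i t && ax3 x v t && ax4 x i v t
      | x <- [0, 3], v <- [9], i <- [0, 1], t <- [Cons 7 (Cons 8 Nil)] ]
```

The index i is kept within the bounds of the sample list, since the original axioms say nothing about out-of-range positions.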

5 GRS Operational Semantics 

In [7] we informally defined: 

Definition 5 A mutable ADT, or MADT, is any ADT
whose operational semantics permits in-place update (i.e.
destructive reuse) of the type of interest, while still retaining
confluence.

We will now make this definition more precise. To do so,
we need an operational semantics that captures sharing
and identifies all relevant costs of execution. The evaluation
function ⇓CBN is not suitable for this purpose
since it is essentially a term-rewriting system. Instead,
we choose a graph-rewriting system (GRS) to capture
operational semantics. Although other approaches to
operational semantics abound (such as recent efforts at
modeling lazy evaluation [10, 1, 14]), we find a GRS to be
particularly clear (a picture is worth a thousand words),
abstract (no heap, for example), expressive (capturing
notions of sharing, updating, and sequencing), and intuitive
(resembling conventional graph reduction).

Figure 4 defines a function lift which maps terms
of L into their corresponding graphs (called L-graphs).
For each CBN δ rule, a corresponding GRS rule can be
obtained by lifting both sides of the rule. For example,
Figure 5 shows the GRS rules for fst and snd. The GRS
rule corresponding to the CBN β rule is also shown in
Figure 5, where the notation e{y/x} denotes the result
of redirecting the edges which point to the bound variable
x in the subgraph e to the subgraph y: thus y is
shared (also, as in [21], we assume that a fresh copy of
the function body e is created before the substitution is
made).

We write e1 ⇓GRS e2 to denote the reduction of e1
to e2 using the GRS-lifted reduction => under a call-by-name
(leftmost, outermost) reduction strategy. The
evaluation function ⇓GRS can be seen as the call-by-need
equivalent of the call-by-name evaluator ⇓CBN.

Theorem 2 For every term e in L,

e ⇓CBN v      lift(e) ⇓GRS v

where v is a value.

Proof: See [21].

5.1 Lifting of ADT’s 

The lifting operation can also be extended to an ADT 
by lifting each side of each axiom. For example, the 
GRS rules for the two nth-update axioms given in the 
introduction are shown in Figure 6. 

When implementing GRS rules, for example using
conventional graph reduction techniques, all of the reductions
are assumed to be “pure.” That is, other than
the top vertex on each of the left- and right-hand sides,
no vertices are assumed to be shared. For example, in
the first nth-update rule of Figure 6, the top vertex on
the left is assumed to be the same as the top vertex
on the right, but the vertices that comprise the cons
operation on the right are assumed to be “fresh” vertices,
distinct from those on the left. This is necessary,
of course, to preserve confluence, as we have discussed 
earlier. But this is also exactly where we would like 
to do better: if some vertices on the left are discarded 
as a result of applying the rule, there is no reason why 
they can’t be immediately re-used on the right, thus 
achieving “in-place update.” In those cases where we 
abandon the “pure” reduction strategy in a rule, we will 
annotate the vertices with names to indicate which are 
being reused, and how. 

5.2 GRS Rules for Encapsulated ADT 

The monad laws can also be lifted into the GRS, as
shown in Figure 7, where t denotes the subgraphs representing
state. Note in the rule for unit that the state
t is single-threaded; i.e. t is reachable from the root of
each side of the rule in exactly one way. On the other
hand, the rule for >>= is not single-threaded, at least
not in this simplistic way: in reality it depends on the
behavior of k and n. This observation hints that if:

• a linear ADT operator is only used “monadically,”
i.e. having type M a and only combined with other
monadic computations via >>=, and

• a fresh (unshared) copy of the type of interest is
passed into a monadic computation when it is invoked,

Figure 4: Lifting Terms to Graphs

Figure 5: GRS Rules for Application and Pair Selections

Figure 6: GRS Rules for nth-update

Figure 8: GRS Rules for Monadic ADT Operators

Figure 9: GRS Rules for nth-update Reusing Vertices



then the single-threadedness of the type of interest can 
be guaranteed throughout the entire computation. 

We can achieve this effect by first lifting the encapsulation
strategy of Section 4.2 into the GRS, as shown
in Figure 8. However, the GRS rule for mM will have
one important twist: the mutator m that it exposes will
be allowed to reuse the type of interest; i.e. in-place update
is allowed. To see this concretely, Figure 9 shows
the rules for nth-update, using the convention of vertex
reuse described earlier. It is interesting to note that
the second rule implies that no vertices are needed to
realize the recursive call structure of nth-update; that
is, it is fully tail-recursive, or iterative. Thus, in our
operational semantics, an application of nth-updateM
requires allocation of exactly two vertices, regardless of
the length of the list.

There is one problem, however: the right-hand side 
of the rule for sM introduces sharing in t, but using CBN 
reduction there is no guarantee that the lifted graph of 
(s x t) on the right-hand side is reduced before some 
subsequent mutation to t. We can fix this problem by 
making fst and snd hyper-strict in the first component 
of their arguments. This is the only sensible way to 
ensure that there are no dangling references to the type 
of interest, and is similar to the use of a hyper-strict let 
expression in [19]. Thus we define new versions of the 
pair selectors, fstA and sndA: 

fstA (v, e) =>δ v
sndA (v, e) =>δ e

where v is a value. We then use these to redefine >>=
and >>:

(m >>= k) t => let p = m t in k (fstA p) (sndA p)
(m >> n) t => n (sndA (m t))

The GRS versions of these rules are exactly the same 
as those in Figure 7 except that fst is replaced by fstA 
and snd by sndA. 

We can see that using hyper-strict pair selectors restores
single-threadedness by reasoning as follows:

• If fstA (sM x t) is reduced before sndA (sM x t),
then t becomes unshared because fstA forces (s x t)
to be fully evaluated.

• Similarly, if sndA (sM x t) is reduced first, sndA
likewise forces (s x t) to be fully evaluated, so t
becomes unshared before it is further used.

The GRS system was necessarily made more strict in
order to achieve single-threadedness. Thus it is possible
that some computations which do not contribute to the
evaluation of the final answer are evaluated (and may
not terminate). For example:

gM c1
  (sM c2 >>= λx ->
   sM c3 >>= λy ->
   unit y)

where x does not appear in c3. The GRS would evaluate
the subgraph representing (s c2 (g c1)), whose value is
obviously irrelevant to the final answer. Nevertheless,
GRS still offers the beauty of lazy evaluation in many
cases. For example:

gM c1
  (mM c2 >>= λx ->
   mM c3 >>= λy ->
   unit 1)

Here the GRS returns 1 without creating even the initial
copy of the type of interest, let alone performing the two
mutations.
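Both behaviours can be imitated in Haskell, with the caveat that this is only an approximation of the GRS: seq forces a value to weak head normal form, which coincides with full evaluation only for flat results such as Int. The state type, the operations pushM, headM, and lengthM, and the runner runWith (playing the role of a generator gM) are assumptions of the sketch.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

type T = [Int]                 -- assumed type of interest
type M a = T -> (a, T)

-- Pair selectors that force the first component before projecting.
fstA :: (a, b) -> a
fstA (v, _) = v `seq` v

sndA :: (a, b) -> b
sndA (v, e) = v `seq` e

unit :: a -> M a
unit a t = (a, t)

bind :: M a -> (a -> M b) -> M b
bind m k t = let p = m t in k (fstA p) (sndA p)

-- Plays the role of gM: run a computation on an initial state.
runWith :: T -> M a -> a
runWith t0 m = fst (m t0)

pushM :: Int -> M ()           -- hypothetical modifier
pushM x t = ((), x : t)

headM :: M Int                 -- hypothetical selector, fails on []
headM t = (h, t)
  where h = case t of
              (x : _) -> x
              []      -> error "headM: empty"

lengthM :: M Int               -- hypothetical selector
lengthM t = (length t, t)

-- Only modifiers: the initial state is never demanded.
lazyExample :: Int
lazyExample =
  runWith (error "state was demanded")
          (pushM 2 `bind` \_ -> pushM 3 `bind` \_ -> unit 1)

-- The first selector's result is unused, yet sndA forces it.
strictExample :: Int
strictExample = runWith [] (headM `bind` \_ -> lengthM `bind` \y -> unit y)

selectorForced :: IO Bool
selectorForced = do
  r <- try (evaluate strictExample) :: IO (Either ErrorCall Int)
  return (either (const True) (const False) r)
```

In this sketch lazyExample evaluates to 1 without touching the initial state, while selectorForced returns True: the failing headM result is forced even though the final answer does not depend on it, mirroring the added strictness described above.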

5.3 Correctness of In-place Update 

Using our somewhat stricter notion of GRS evaluation,
we can prove a correctness result for in-place update.
Define L_M as the language L extended with any monadically
encapsulated, linear ADT M.

Theorem 3 For every term e in L_M,

e ⇓CBN v      lift(e) ⇓GRS v

where v is a value.

Proof: See Appendix C.

6 Discussion 

In this section we discuss limitations and possible ex¬ 
tensions of our methodology. 

Although we believe that the added strictness intro¬ 
duced in Section 5.2 is minimal, and that some added 
strictness is inherent in any single-threaded reduction 
system, we are investigating ways to improve this, such 
as dynamically tagging closures having no references to 
the type of interest. But it is not clear that such a small 
improvement is worthwhile. 

Polymorphism should be easy to handle, at the ex¬ 
pense of carrying around more type information. For 
example, the integer list ADT could be turned into a 
polymorphic list, but the monad type would have to be 
extended to something like M a b, where a is the type 
returned from the computation, and b is the type of the 
list elements. 



The linear type system proposed in Section 3 is lim¬ 
ited to first-order values, so that, for example, a selector 
cannot return a function as its result. We believe, how¬ 
ever, that the type system can be extended to handle 
abstractions, and we have worked out the preliminary 
ideas of such an extension. 

A more serious limitation of the current system is
that it does not handle more than one mutable ADT.
One possible solution to this is to introduce references,
so that individual objects can be named and
manipulated, and use the extension to the Hindley-Milner type
system proposed in [11] to prevent the escape of
references from the monadic computation.

Related to this issue is the inability to handle
tree-like structures, ruled out by our definition of a simple
ADT. Rather than using references, we have been
working on a different solution to this problem, based on the
following idea.

Suppose we are designing a database-like ADT, whose
axiomatization is given by:

    new    :: DB
    insert :: Int -> DB -> DB
    remove :: Int -> DB -> DB

    remove a new          = new
    remove a (insert b x) = if a = b then x
                            else insert b (remove a x)
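
These axioms can be read directly as a (deliberately naive) Haskell program. The sketch below assumes that new and insert are modelled as data constructors, here named New and Insert, so that the two remove axioms become the two defining equations of a function.

```haskell
-- Constructor-term model of the DB axiomatization.
data DB = New | Insert Int DB
  deriving (Eq, Show)

-- The two axioms for remove, one equation each.
remove :: Int -> DB -> DB
remove _ New = New
remove a (Insert b x) =
  if a == b then x else Insert b (remove a x)
```

For instance, remove 2 (Insert 1 (Insert 2 (Insert 3 New))) rewrites to Insert 1 (Insert 3 New). Each equation uses the DB argument exactly once on its right-hand side.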

This is clearly linear, so our methodology can be
applied, yielding a mutable ADT with monadic axioms,
etc. However, it is not very efficient. So, as in a
traditional ADT design, we may wish to implement the ADT
using binary search trees. The implementation details
should be hidden, of course, and we do so below using
Haskell module syntax:

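    module DB (DB, new, insert, find, remove) where

    data DB = Leaf | Branch Int DB DB
      deriving Eq

    new :: DB
    new = Leaf

    insert :: Int -> DB -> DB
    insert a Leaf = Branch a Leaf Leaf
    insert a (Branch b lt rt) =
      if a <= b then Branch b (insert a lt) rt
                else Branch b lt (insert a rt)

    -- find is exported above but not defined in the original;
    -- a membership test is assumed here.
    find :: Int -> DB -> Bool
    find _ Leaf = False
    find a (Branch b lt rt)
      | a == b    = True
      | a <  b    = find a lt
      | otherwise = find a rt

    remove :: Int -> DB -> DB
    remove _ Leaf = Leaf
    remove a (Branch b lt rt) =
      if a == b
        then if lt == Leaf then rt else
             if rt == Leaf then lt else
               let pred = largest lt
                   lt'  = remove pred lt
               in Branch pred lt' rt
        else if a <= b then Branch b (remove a lt) rt
                       else Branch b lt (remove a rt)
      where largest (Branch x l r) =
              if r == Leaf then x else largest r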

The interesting thing about this solution is that the
tree is used linearly in the implementation when it is
viewed as a set of axioms; our type system is simply not
yet strong enough to determine this. If it could be
suitably strengthened, then this encapsulation of the
axioms should allow in-place update of the tree.

7 Conclusions 

Using state monads to achieve safe destructive reuse of
the state is by no means an innovation; several earlier
efforts have already demonstrated its suitability. What
have been missing in these previous efforts are a
systematic way to recognize when monads work and when
they don’t, an operational semantics to formally reason
about the correctness of in-place updates, a method for
designing efficient state monads, and a way to reason
about them abstractly. Our contributions take positive
steps toward resolving these deficiencies.

A A Complete Example: Array ADT

Consider a very simple, fixed-size integer array ADT, 
with operations: 

    newArr :: (Int, ..., Int) -> Array
    update :: (Ix, Int) -> Array -> Array
    select :: Ix -> Array -> Int

whose array axiomatization is given by:

    select i (newArr (x_1, ..., x_n)) => x_i

    update (i, y) (newArr (x_1, ..., x_i, ..., x_n))
        => newArr (x_1, ..., y, ..., x_n)

Intuitively, newArr xs, where xs is an n-tuple, creates
an array of size n, each of whose elements is initialized
to its corresponding component in xs; update (i,v) a
returns an array identical to a except that the value of
the ith element is v; and select i a returns the value of
the ith element of a.
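
A minimal executable model of this ADT is sketched here. It rests on two assumptions that are not part of the paper: a Haskell list stands in for the n-tuple argument of newArr, and indices are taken to start at 1, as the comments in Figure 12 suggest.

```haskell
type Ix = Int

-- A list stands in for the fixed-size array (an assumption of this sketch).
newtype Array = Array [Int]
  deriving (Eq, Show)

newArr :: [Int] -> Array
newArr = Array

-- update (i, v) a: identical to a except that element i is v.
update :: (Ix, Int) -> Array -> Array
update (i, v) (Array xs) =
  Array [ if j == i then v else x | (j, x) <- zip [1 ..] xs ]

-- select i a: the value of element i (1-based; fails if out of range).
select :: Ix -> Array -> Int
select i (Array xs) = xs !! (i - 1)
```

With these definitions, select 2 (newArr [5, 6, 7]) is 6 and update (2, 9) (newArr [5, 6, 7]) equals newArr [5, 9, 7], matching the two axioms.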

It can be verified that these axioms satisfy our
linearity condition. The GRS rules for this ADT are given in
Figure 10. Monadically encapsulating newArr, update,
and select, we get:

    type M a = Array -> (a, Array)
    newArrM :: (Int, ..., Int) -> M a -> a
    updateM :: (Ix, Int) -> M ()
    selectM :: Ix -> M Int

Their GRS rules are shown in Figure 11, and a
programming example in this MADT is given in Figure 12.
The correctness of destructively reusing the array follows
from Theorem 3.

Figure 10: GRS Rules for Array ADT

Figure 11: GRS Rules for Monadic Array MADT


Finally, we apply the axiom translation scheme to
obtain the following monadic MADT axiomatization:

    newArrM (... x_i ...) (selectM i ▷ q)
        => newArrM (... x_i ...) (q x_i)
    newArrM (... x_i ...) (updateM (i, y) ▷ d)
        => newArrM (... y ...) d
    newArrM (... x_i ...) (unit b) => b
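
The monadic signatures and the translated axioms can be exercised with a small state-passing model. This is a sketch under assumptions, not the GRS implementation of Figure 11: the array is a plain list, indices start at 1, ▷ is written |>, and in the update axiom the continuation is written as a function that ignores the unit result. The concrete values in the axiom instances are arbitrary.

```haskell
type Ix = Int
type Array = [Int]                  -- a list stands in for the array
type M a = Array -> (a, Array)

unit :: a -> M a
unit a = \t -> (a, t)

(|>) :: M a -> (a -> M b) -> M b
m |> k = \p -> k (fst (m p)) (snd (m p))

newArrM :: [Int] -> M a -> a
newArrM xs n = fst (n xs)

updateM :: (Ix, Int) -> M ()
updateM (i, v) =
  \t -> ((), [ if j == i then v else x | (j, x) <- zip [1 ..] t ])

selectM :: Ix -> M Int
selectM i = \t -> (t !! (i - 1), t)

-- The program of Figure 12, for a three-element array.
figure12 :: Int
figure12 =
  newArrM [0, 0, 0]
    ( updateM (1, 1) |> \_ ->
      selectM 1      |> \x ->
      selectM 1      |> \y ->
      unit (x + y) )

-- One concrete instance of each translated axiom.
axiomSelect, axiomUpdate, axiomUnit :: Bool
axiomSelect = newArrM [5, 6, 7] (selectM 2 |> q) == newArrM [5, 6, 7] (q 6)
  where q v = unit (v * 10)
axiomUpdate =
  newArrM [5, 6, 7] (updateM (2, 9) |> \_ -> d) == newArrM [5, 9, 7] d
  where d = selectM 2
axiomUnit = newArrM [5, 6, 7] (unit 'b') == 'b'
```

In this model figure12 evaluates to 2, and the three axiom instances all hold. The model copies the list on every update; it illustrates the equations only, not the in-place behaviour.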

B The Proof of Theorem 1 

The proof is a tedious proof by cases on the formation
of both sides of a linear axiom. In some cases, an
induction argument is necessary. We illustrate the main idea
by going through one sample case. Consider a selector
axiom s x_1 t => x_2. To omit the uninteresting details, let
us assume x_1 and x_2 are two identifiers of proper
auxiliary types. By inspecting the typing rules in Figure 3
and the definition of simple ADTs, we know that t can
be either an identifier of the type of interest, an application
of some constructor (this case is proved inductively), or
an application of some generator. In the last case,
the axiom looks like

    s x_1 (g x_3) => x_2.

Applying S to this axiom, we get

    gM x_3 (sM x_1 ▷ λy -> n) => gM x_3 n[x_2/y].


Unfolding gM, sM, and ▷ on the left-hand side of the
axiom:

           gM x_3 (sM x_1 ▷ λy -> n)
    =>_gM  fst ((sM x_1 ▷ λy -> n) (g x_3))
    =>_▷   fst ((λp -> (λy -> n) (fst (sM x_1 p))
                                 (snd (sM x_1 p))) (g x_3))
    =>_β   fst ((λy -> n) (fst (sM x_1 (g x_3)))
                          (snd (sM x_1 (g x_3))))
    =>_β   fst (n[fst (sM x_1 (g x_3))/y]
                  (snd (sM x_1 (g x_3))))
    =>_sM  fst (n[fst (s x_1 (g x_3), g x_3)/y]
                  (snd (sM x_1 (g x_3))))
    =>     fst (n[x_2/y] (snd (sM x_1 (g x_3))))
    =>_sM  fst (n[x_2/y] (snd (s x_1 (g x_3), g x_3)))
    =>_sM  fst (n[x_2/y] (g x_3))

Unfolding gM on the right-hand side:

           gM x_3 n[x_2/y]
    =>_gM  fst (n[x_2/y] (g x_3))

This term is identical to the unfolding of the left-hand 
side and we are done. □ 
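
The sample case can also be checked on a concrete instance. The sketch below is an illustration only: it assumes the definitions gM x n = fst (n (g x)), sM x t = (s x t, t), and the state-passing bind used in the unfolding, writes ▷ as |>, and picks a hypothetical generator g (building a two-element list from a pair) and selector s (indexing), so that s 1 (g (a, b)) => b plays the role of the axiom s x_1 (g x_3) => x_2.

```haskell
type T = [Int]                      -- hypothetical type of interest
type M a = T -> (a, T)

(|>) :: M a -> (a -> M b) -> M b
m |> k = \p -> k (fst (m p)) (snd (m p))

-- Hypothetical generator and selector satisfying s 1 (g (a, b)) = b.
g :: (Int, Int) -> T
g (a, b) = [a, b]

s :: Int -> T -> Int
s i t = t !! i

gM :: (Int, Int) -> M a -> a
gM x n = fst (n (g x))

sM :: Int -> M Int
sM x = \t -> (s x t, t)

-- An arbitrary continuation n with free variable y.
n :: Int -> M Int
n y = \t -> (y + sum t, t)

-- Left- and right-hand sides of the translated axiom.
lhs, rhs :: (Int, Int) -> Int
lhs x3       = gM x3 (sM 1 |> \y -> n y)
rhs x3@(_, b) = gM x3 (n b)         -- n[x_2/y] with x_2 = b
```

For example, lhs (3, 4) and rhs (3, 4) both evaluate to 11, as the unfolding predicts.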

C The Proof of Theorem 3 

Following the development in [5], it can be shown that
for any term e in L_M:

    e ⇓_cbn v  ⟺  lift(e) ⇓_grs v

where the monadic mutators in GRS do updates
non-destructively.

So if we can prove that it is always safe to do
in-place updates in GRS as an “optimization,” it follows
that for an L_M term e, its CBN evaluation yields the
same result as its in-place update GRS evaluation.



    newArrM (0, ..., 0)      -- allocate new array and invoke a computation
      (updateM (1, 1) ▷      -- set 1st element to 1
       selectM 1 ▷ λx ->     -- read first element and bind it to x
       selectM 1 ▷ λy ->     -- read first element and bind it to y
       unit (x + y))         -- return the sum x + y (the value 2)

Figure 12: A Simple Program Using Array MADT


In order to prove that in-place update on a type of
interest is safe, it is sufficient to show that a mutator
can only update an unshared type of interest. More
precisely, we want to prove that for a monadic term:

    gM e_1 (n_1 ▷ n_2 ▷ ... ▷ n_i ▷ ... ▷ n_k)

the type of interest after reduction of sndA (n t) at any
step i, where n is the composition of computations n_1
through n_i, is unshared.

Base case: i = 0: the type of interest is supplied by
the rule for gM, which by definition is always unshared.

Induction hypothesis: the type of interest after
reduction of sndA (n t) at step i is unshared.

Induction step: prove that the type of interest is
unshared after reduction of sndA (n t) at step i + 1. We
proceed by case analysis of the operation at step i + 1:

• unit: The rule for unit shows that it simply passes
on the type of interest without adding any pointers.
So by the induction hypothesis, it is still unshared.

• mM: The rule for mM shows that it applies m to
the unshared type of interest and passes the result
on. But our linearity condition guarantees that m
handles the type of interest single-threadedly. So
the type of interest is still unshared.

• sM: The rule for sM adds a pointer to t in the first
component of the pair; i.e. the reference to t in
s x t. But the reduction rule for sndA reduces this
value to normal form, and since it is also linear, it
cannot have any dangling reference to t. sM also
simply passes the type of interest on, but this is
now the only reference, so the type of interest is
still unshared.

Thus by induction, the type of interest is always
unshared just after the reduction of sndA (n t). It follows
that mM is always handed an unshared copy of the type
of interest, and mM can safely perform an in-place
update. This further implies that:

    e ⇓_cbn v  ⟺  lift(e) ⇓_grs v

where v is a value and GRS employs rules doing in-place
update on the type of interest. □
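
As an informal analogy only, the agreement between a copying evaluation and an in-place one can be observed in Haskell by running the program of Figure 12 twice: once over an immutable list, and once over a mutable array in the ST monad. This sketch is not the GRS of this paper and proves nothing; it assumes the standard array package (Data.Array.ST) is available, fixes the array size at three, and uses the hypothetical helper names updateFirst and selectFirst.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, newListArray, readArray, writeArray)

-- Copying version: the update builds a new list.
updateFirst :: Int -> [Int] -> [Int]
updateFirst v (_ : rest) = v : rest
updateFirst _ []         = []

selectFirst :: [Int] -> Int
selectFirst (x : _) = x
selectFirst []      = error "selectFirst: empty array"

copying :: Int
copying =
  let a1 = updateFirst 1 [0, 0, 0]
  in selectFirst a1 + selectFirst a1

-- In-place version: the single array is overwritten destructively.
inPlace :: Int
inPlace = runST (do
  arr <- newListArray (1, 3) [0, 0, 0] :: ST s (STUArray s Int Int)
  writeArray arr 1 1
  x <- readArray arr 1
  y <- readArray arr 1
  return (x + y))
```

Both copying and inPlace evaluate to 2. The destructive write is safe in the second version because the array is never shared outside the single thread of the computation, which is the property the induction above establishes for the type of interest.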